Although Reinforcement Learning (RL) has shown impressive results in games and simulation, real-world application of RL suffers from its instability under changing environment conditions and hyperparameters. We give a first impression of the extent of this instability by showing that the hyperparameters found by automatic hyperparameter optimization (HPO) methods are not only dependent on the problem at hand, but even on how well the state describes the environment dynamics. Specifically, we show that agents in contextual RL require different hyperparameters if they are shown how environmental factors change. In addition, finding adequate hyperparameter configurations is not equally easy for both settings, further highlighting the need for research into how hyperparameters influence learning and generalization in RL.
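As a toy illustration of the setting above (not the paper's benchmark or HPO method), the sketch below contrasts the two variants of a contextual RL setup: one where the environmental factor is appended to the observation and one where it is hidden, each of which would receive its own hyperparameter search. All names, ranges, and the search loop are assumptions.

```python
import numpy as np

# Hypothetical sketch: the same contextual task, with the context (an
# environmental factor such as a varying pole length) either appended to the
# observation or hidden from the agent. Names, ranges and the toy search loop
# are illustrative assumptions, not the paper's setup.

def make_observation(state: np.ndarray, context: float, context_visible: bool) -> np.ndarray:
    """Return what the agent observes under the chosen visibility setting."""
    return np.append(state, context) if context_visible else state

def sample_config(rng: np.random.Generator) -> dict:
    """One HPO trial: sample a learning rate and a discount factor."""
    return {"lr": 10 ** rng.uniform(-5, -2), "gamma": rng.uniform(0.9, 0.999)}

rng = np.random.default_rng(0)
for context_visible in (True, False):
    # In a full study each sampled config would train an agent and the best
    # config would be kept; the abstract's point is that the two incumbents
    # generally differ between the visible and hidden settings.
    configs = [sample_config(rng) for _ in range(3)]
    obs = make_observation(np.zeros(4), context=0.7, context_visible=context_visible)
    print(f"context visible={context_visible}: obs dim={obs.shape[0]}, trial configs={configs}")
```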
The goal of Unsupervised Reinforcement Learning (URL) is to find a reward-agnostic prior policy on a task domain so that the sample efficiency on supervised downstream tasks is improved. Although an agent initialized with such a prior policy achieves higher rewards when fine-tuned on downstream tasks, it remains an open question how an optimal pretrained prior policy can be achieved in practice. In this work, we introduce POLTER (Policy Trajectory Ensemble Regularization), a general method to regularize the pretraining of any URL algorithm that is particularly useful for data- and knowledge-based URL algorithms. It utilizes an ensemble of policies discovered during pretraining and moves the policy of the URL algorithm closer to its optimal prior policy. Our method is based on a theoretical framework, and we analyze its practical effects on a white-box benchmark that allows us full control over POLTER. In our main experiments, we evaluate POLTER on the Unsupervised Reinforcement Learning Benchmark (URLB), which consists of 12 tasks in 3 domains. We demonstrate the generality of our method by improving the performance of a diverse set of data- and knowledge-based URL algorithms by 19% on average and up to 40% in the best case. In a fair comparison with tuned baselines and tuned POLTER, we establish a new state of the art on the URLB.
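A heavily hedged sketch of the idea described above: policy snapshots collected during pretraining form an ensemble, and an extra term pulls the current policy toward the ensemble's average behaviour. The KL form of the penalty, its weight, and all names are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

# Illustrative-only sketch: during URL pretraining we keep snapshots of the
# policy's action distributions and add a penalty that keeps the current
# policy close to the ensemble average.

def ensemble_regularizer(current_logits: torch.Tensor,
                         snapshot_logits: list[torch.Tensor]) -> torch.Tensor:
    """KL(ensemble mean || current policy), averaged over the batch of states."""
    ensemble_probs = torch.stack([F.softmax(l, dim=-1) for l in snapshot_logits]).mean(0)
    current_log_probs = F.log_softmax(current_logits, dim=-1)
    return F.kl_div(current_log_probs, ensemble_probs, reduction="batchmean")

# Usage inside a hypothetical pretraining step:
batch, actions = 32, 4
snapshots = [torch.randn(batch, actions) for _ in range(3)]  # earlier policy snapshots
logits = torch.randn(batch, actions, requires_grad=True)     # current policy output
intrinsic_loss = logits.sum() * 0.0                          # placeholder for the URL algorithm's own objective
loss = intrinsic_loss + 0.1 * ensemble_regularizer(logits, snapshots)
loss.backward()
```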
Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, they are often treated as black-box models, and as such it is difficult to understand what exactly they learn as well as how they arrive at a particular prediction. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) non-regular languages such as balanced parentheses and palindromes, where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition and text categorisation. We show that state-regularization (a) simplifies the extraction of finite state automata that display an RNN's state transition dynamics; (b) forces RNNs to operate more like automata with external memory and less like finite state machines, which potentially leads to a more structured memory; (c) leads to better interpretability and explainability of RNNs by leveraging the probabilistic finite state transition mechanism over time steps.
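A minimal sketch of the mechanism described above, using a soft variant for illustration: after each recurrent update the hidden state is softly snapped onto a small set of learnable centroid states. The exact transition rule (sampling versus soft mixing, the similarity function, the cell type) is an assumption here, not the paper's precise formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateRegularizedGRU(nn.Module):
    """Sketch of a state-regularized recurrent cell (details are assumptions):
    after each GRU step the hidden state is projected onto a probability
    distribution over a small set of learnable centroid states, so transitions
    happen between a finite set of states rather than in free continuous space."""

    def __init__(self, input_size: int, hidden_size: int, num_states: int = 10,
                 temperature: float = 1.0):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.centroids = nn.Parameter(torch.randn(num_states, hidden_size))
        self.temperature = temperature

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: (seq_len, batch, input_size)
        h = inputs.new_zeros(inputs.size(1), self.centroids.size(1))
        for x_t in inputs:
            u = self.cell(x_t, h)                           # ordinary recurrent update
            scores = u @ self.centroids.t() / self.temperature
            alpha = F.softmax(scores, dim=-1)               # distribution over the finite states
            h = alpha @ self.centroids                      # convex combination of centroids
        return h

h = StateRegularizedGRU(input_size=8, hidden_size=16)(torch.randn(5, 3, 8))
```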
Survival analysis is the branch of statistics that studies the relation between the characteristics of living entities and their respective survival times, taking into account the partial information held by censored cases. A good analysis can, for example, determine whether one medical treatment for a group of patients is better than another. With the rise of machine learning, survival analysis can be modeled as learning a function that maps studied patients to their survival times. To succeed with that, there are three crucial issues to be tackled. First, some patient data is censored: we do not know the true survival times for all patients. Second, data is scarce, which led past research to treat different illness types as domains in a multi-task setup. Third, there is the need for adaptation to new or extremely rare illness types, where little or no labels are available. In contrast to previous multi-task setups, we want to investigate how to efficiently adapt to a new survival target domain from multiple survival source domains. For this, we introduce a new survival metric and the corresponding discrepancy measure between survival distributions. These allow us to define domain adaptation for survival analysis while incorporating censored data, which would otherwise have to be dropped. Our experiments on two cancer data sets reveal a superb performance on target domains, a better treatment recommendation, and a weight matrix with a plausible explanation.
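To make the data setting concrete, the sketch below shows how censored survival data is typically represented and computes a naive gap between the survival curves of two domains. The Kaplan-Meier-based discrepancy, the toy numbers, and all names are stand-ins for exposition only, not the survival metric or discrepancy measure proposed in the work above.

```python
import numpy as np

# Illustrative-only sketch: censored survival data is stored as (time, event)
# pairs, where event=1 means the failure was observed and event=0 means the
# case was censored at that time.

def kaplan_meier(times: np.ndarray, events: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Kaplan-Meier survival probability S(t) evaluated at each time in `grid`."""
    curve = []
    for t in grid:
        surv = 1.0
        for ti in np.unique(times[(times <= t) & (events == 1)]):
            at_risk = np.sum(times >= ti)
            died = np.sum((times == ti) & (events == 1))
            surv *= 1.0 - died / at_risk
        curve.append(surv)
    return np.array(curve)

def survival_discrepancy(domain_a, domain_b, grid) -> float:
    """Mean absolute gap between the survival curves of two domains (a stand-in measure)."""
    return float(np.mean(np.abs(kaplan_meier(*domain_a, grid) - kaplan_meier(*domain_b, grid))))

grid = np.linspace(0.0, 60.0, 13)
source = (np.array([5.0, 12.0, 20.0, 33.0, 40.0]), np.array([1, 0, 1, 1, 0]))  # a source domain
target = (np.array([4.0, 9.0, 15.0, 28.0, 55.0]), np.array([1, 1, 0, 1, 1]))   # the target domain
print(survival_discrepancy(source, target, grid))
```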
Knowledge graphs (KGs) store information in the form of (head, predicate, tail) triples. To augment KGs with new knowledge, researchers have proposed models for KG completion (KGC) tasks such as link prediction, i.e., answering (h; p; ?) or (?; p; t) queries. Such models are typically evaluated with averaged metrics on a fixed test set. While useful for tracking progress, averaged single-score metrics cannot reveal what a model has actually learned or failed to learn. To address this issue, we propose KGxBoard: an interactive framework for performing fine-grained evaluation on meaningful subsets of the data, each of which tests individual and interpretable capabilities of a KGC model. In our experiments, we highlight findings discovered with KGxBoard that would have been impossible to detect with standard averaged single-score metrics.
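A minimal sketch of what fine-grained KGC evaluation means in practice: instead of one averaged score, test triples are grouped into interpretable buckets and a metric is reported per bucket. The bucketing feature (the predicate), the toy triples, and the helper names are assumptions; KGxBoard itself offers many more bucketing options and an interactive interface.

```python
from collections import defaultdict

def mean_reciprocal_rank(ranks):
    """Standard MRR over a list of ranks of the correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# each entry: ((head, predicate, tail), rank of the correct tail for the (h, p, ?) query)
results = [
    (("berlin", "capital_of", "germany"), 1),
    (("paris", "capital_of", "france"), 2),
    (("goethe", "born_in", "frankfurt"), 7),
    (("kafka", "born_in", "prague"), 12),
]

buckets = defaultdict(list)
for (head, predicate, tail), rank in results:
    buckets[predicate].append(rank)

print("overall MRR:", mean_reciprocal_rank([r for _, r in results]))
for predicate, ranks in buckets.items():
    # the per-bucket scores reveal where the model does well or poorly
    print(f"MRR for '{predicate}':", mean_reciprocal_rank(ranks))
```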
With human-centered research (HCR), we can steer research activities so that the research results are beneficial for human stakeholders, such as end users. But what exactly makes research human-centered? We address this question by providing a working definition and by defining how a research pipeline can be divided into different stages at which human-centered components can be added. In addition, we discuss existing NLP research with HCR components and define a series of guiding questions that can serve as a starting point for researchers interested in exploring human-centered research approaches. We hope that this work inspires researchers to refine the proposed definition and to pose further questions that are meaningful for achieving HCR.
Neural networks have achieved impressive results on many medical imaging tasks but often perform poorly on out-of-distribution datasets originating from different medical centres or patient cohorts. Evaluating this lack of ability to generalize and addressing the underlying problem are two major challenges in developing neural networks intended for clinical practice. In this study, we develop a new method for evaluating the generalization ability of neural network models by generating a large number of distribution-shifted datasets, which can be used to thoroughly investigate their robustness to the variability encountered in clinical practice. Compared to external validation, \textit{shifted evaluation} can provide explanations of why a neural network fails on a given dataset and thus offers guidance on how to improve model robustness. With shifted evaluation, we show that neural networks trained with state-of-the-art methods are highly fragile to even small distribution shifts from the training data and, in some cases, lose all discriminative ability. To address this fragility, we develop an augmentation strategy explicitly designed to increase the robustness of neural networks to distribution shifts. \texttt{StrongAugment} is evaluated with large-scale, heterogeneous histopathology data, including five training datasets from two tissue types, 274 distribution-shifted datasets, and 20 external datasets from four countries. Neural networks trained with \texttt{StrongAugment} maintain similar performance on all datasets, even under distribution shifts where networks trained with current state-of-the-art methods lose all discriminative ability. We recommend using strong augmentation and shifted evaluation for training and evaluating all neural networks intended for clinical practice.
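A small, hedged sketch of the shifted-evaluation idea: a single test set is turned into many evaluation sets by sweeping one acquisition-style factor at a time. The transforms (brightness, blur on toy arrays), their ranges, and the function names are assumptions for illustration; the study above covers far more factors and real histopathology data.

```python
import numpy as np

def shift_brightness(images: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities to imitate a brightness shift."""
    return np.clip(images * factor, 0.0, 1.0)

def shift_blur(images: np.ndarray, width: int) -> np.ndarray:
    """Apply a simple 1D box blur along the last axis to imitate lost sharpness."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(lambda row: np.convolve(row, kernel, mode="same"), -1, images)

def shifted_datasets(test_images: np.ndarray):
    """Yield (description, shifted copy) pairs for a grid of shift magnitudes."""
    for factor in (0.5, 0.75, 1.25, 1.5):
        yield f"brightness x{factor}", shift_brightness(test_images, factor)
    for width in (3, 5, 9):
        yield f"blur width {width}", shift_blur(test_images, width)

# A "strong" training augmentation would sample such transforms with wide,
# random magnitudes for every training image instead of a fixed grid.
images = np.random.rand(4, 32, 32)
for name, shifted in shifted_datasets(images):
    print(name, shifted.shape)  # each shifted set would be scored separately
```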
With the rise of AI systems in real-world applications comes the need for reliable and trustworthy AI. One fundamental aspect of this is explainable AI systems. However, there are no agreed-upon standards for how explainable AI systems should be evaluated. Inspired by the Turing test, we introduce a human-centric assessment framework in which a leading domain expert accepts or rejects the solutions of an AI system and of another domain expert. By comparing the acceptance rates of the provided solutions, we can assess how the AI system performs compared to the domain expert and whether the AI system's explanations (if provided) are comprehensible. This setup, comparable to the Turing test, can serve as a framework for a wide range of human-centric AI system assessments. We demonstrate this by presenting two instantiations: (1) an assessment of a system's classification accuracy, with the option to incorporate label uncertainty; (2) an assessment in which the usefulness of the provided explanations is determined in a human-centric manner.
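To make the comparison concrete, here is a minimal, hypothetical computation of the acceptance rates the framework compares; the counts are made up.

```python
# Hypothetical counts: a leading expert accepts or rejects solutions without
# knowing whether they came from the AI system or from another domain expert.
ai_accepted, ai_total = 41, 50
expert_accepted, expert_total = 44, 50

ai_rate = ai_accepted / ai_total              # 0.82
expert_rate = expert_accepted / expert_total  # 0.88
print(f"AI acceptance rate: {ai_rate:.2f}, expert acceptance rate: {expert_rate:.2f}")
# Comparable rates would indicate that the AI's solutions (and explanations,
# if shown) pass for expert-level work under this Turing-test-like setup.
```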
Many state-of-the-art neural networks for medical imaging generalize poorly to data unseen during training. This behavior can be caused by networks overfitting to easy-to-learn or statistically dominant features while neglecting other potentially informative features. For example, indiscernible differences in the sharpness of images from two different scanners can significantly degrade a network's performance. All neural networks intended for clinical practice need to be robust to variation in data caused by differences in imaging equipment, sample preparation, and patient populations. To address these challenges, we evaluate the utility of spectral decoupling as an implicit bias mitigation method. Spectral decoupling encourages the neural network to learn more features by simply regularizing the network's unnormalized prediction scores, without any increase in computational cost. We show that spectral decoupling allows training neural networks on datasets with strong spurious correlations and increases networks' robustness to data distribution shifts. To validate our findings, we train networks with and without spectral decoupling to detect prostate cancer from tissue slides and COVID-19 from chest radiographs. Networks trained with spectral decoupling achieve up to 9.5% higher performance on external datasets. Our results show that spectral decoupling helps with the generalization issues associated with neural networks and can be used to complement or replace computationally expensive explicit bias mitigation methods, such as stain normalization in histological images. We recommend using spectral decoupling as an implicit bias mitigation method in any neural network intended for clinical use.
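The abstract above describes spectral decoupling as a regularizer on the network's unnormalized prediction scores; the sketch below shows how such a penalty would plug into an otherwise unchanged training step. The squared-L2 form of the penalty, its weight, and the toy model are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Sketch of a spectral-decoupling-style penalty: the only change to training
# is an extra term on the logits (assumed squared-L2 form and weight).
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))
criterion = nn.CrossEntropyLoss()
sd_weight = 0.01  # assumed penalty weight

images = torch.randn(8, 1, 32, 32)
labels = torch.randint(0, 2, (8,))

logits = model(images)
loss = criterion(logits, labels) + sd_weight * (logits ** 2).mean()
loss.backward()  # the rest of the training loop is unchanged
```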
In the last few years, Artificial Intelligence (AI) has achieved a notable momentum that, if harnessed appropriately, may deliver the best of expectations over many application sectors across the field. For this to occur shortly in Machine Learning, the entire community stands in front of the barrier of explainability, an inherent problem of the latest techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that were not present in the last hype of AI (namely, expert systems and rule-based models). Paradigms underlying this problem fall within the so-called eXplainable AI (XAI) field, which is widely acknowledged as a crucial feature for the practical deployment of AI models. The overview presented in this article examines the existing literature and contributions already done in the field of XAI, including a prospect toward what is yet to be reached. For this purpose, we summarize previous efforts made to define explainability in Machine Learning, establishing a novel definition of explainable Machine Learning that covers such prior conceptual propositions with a major focus on the audience for which the explainability is sought. Departing from this definition, we propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at explaining Deep Learning methods, for which a second dedicated taxonomy is built and examined in detail. This critical literature analysis serves as the motivating background for a series of challenges faced by XAI, such as the interesting crossroads of data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to the field of XAI with a thorough taxonomy that can serve as reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability.